ABSTRACT

RNA degradation can distort or prevent measure- 
ment of RNA transcripts. A mathematical model 
for degradation was constructed, based on 
random RNA damage and exponential polymerase 
chain reaction (PCR) amplification. Degradation, 
measured as the number of lesions/base, can be 
quantified by amplifying several sequences of a reference
gene, calculating the regression of Ct on
amplicon length and determining the slope.
Reverse transcriptase-quantitative PCR (RT-qPCR)
data can then be corrected for degradation using
lesions/base, amplicon length(s) and the relevant
equation obtained from the model. Several predictions
of the model were confirmed experimentally:
degradation in a sample quantified using the model
correlated with degradation quantified using an
additional control sample and the ΔΔCt method,
and application of the model corrected erroneous
results for relative quantification resulting from
degradation and differences in amplicon length.
Compared with RIN, the method was quantitative, 
simpler, more sensitive and spanned a wider range 
of RNA damage. The method can use either random 
or specifically primed complementary DNA and it 
enables relative and absolute quantification of RNA 
to be corrected for degradation. The model and 
method should be applicable to many situations in 
which RNA is quantified, including quantification 
of RNA by methods other than nucleic acid 
amplification. 

INTRODUCTION 

Quantification of messenger RNA (mRNA) by reverse 
transcription followed by nucleic acid amplification, 
usually polymerase chain reaction (PCR) [reverse
transcriptase-quantitative PCR (RT-qPCR)], is widely
performed, but the results are very dependent on the 
quality of the RNA. This may vary widely, because 
RNA may be degraded by a variety of physical and 
chemical factors including heat, radiation, chemicals and 
tissue ribonucleases. It may be difficult to prevent these, 
particularly for stored tissue, fixed samples and archival 
specimens. Several approaches have been used to assess 
RNA quality. These include spectrophotometry, analysis 
of 18S and 28S rRNA by electrophoresis, analysis of the 
complete RNA pattern on electrophoresis (RIN, Agilent 
Technologies) (1), the 5'-3' assay (2) and PCR amplification
of different target lengths of complementary DNA
(cDNA) (3,4). Generally, these methods provide a qualitative
or semi-quantitative measure of RNA integrity
rather than a quantitative measure, although modification 
of the RIN algorithm has been claimed to improve RT- 
qPCR quantification (5), and Gong et al. (4) suggested 
quantitative conclusions from their data. A qualitative 
measure of integrity may indicate that the sample is 
adequate for analysis, but a quantitative measure 
enables quantification of an RNA target to be corrected 
for degradation. 

We recently described a method for quantifying the 
integrity of genomic DNA in a sample by determining 
the probability that a base is damaged and the fraction 
of target molecules that are intact and amplifiable by 
qPCR. This information enables the true number of 
target molecules in a sample to be determined (6). 
The method assumed that DNA damage was random, 
and the model combined the mathematics of the 
Poisson distribution and the mathematics of exponen- 
tial amplification. We have now used a similar 
approach to model RNA degradation, enabling quanti- 
fication of RNA integrity. Hence RT-qPCR can be 
corrected for degradation. Modelling RNA is more 
complicated owing to the additional step of reverse 
transcription. Nevertheless, the results are consistent 
with the model and suggest that the same principles 
can be used. 
In this article, we describe the mathematical model
underlying the method, present experiments testing predic- 
tions of the model, illustrate its use for relative quantifi- 
cation and finally discuss practical applications. 

Mathematical analysis 
Poisson statistics 

A lesion is defined as damage to an RNA or cDNA strand 
which prevents formation of an intact cDNA target for 
PCR. The basic assumption of the model is that lesions 
occur randomly and independently. This assumption
is hard to doubt for external physical or
chemical agents, which damage RNA by hydrolytic, 
phosphorolytic or thermodynamic cleavage, or by the 
random production of adducts. However, RNA can also 
be degraded by the action of a large number of ribonucle- 
ases, either endoribonucleases or exoribonucleases. 
Endoribonucleases may show some base or sequence spe- 
cificity. But, in relation to the total RNA strand, bases 
and/or short sequences occur at random, so enzyme 
activity can also be regarded as random. The randomness 
or non-randomness of exoribonucleases is difficult to 
assess because there are many enzymes and a variety of 
mechanisms. However, for most exoribonucleases and for 
most RNA sequences, the RNA strand degraded by the 
enzyme is completely degraded (7) and we are not aware 
of any compelling evidence that this occurs in a 
non-random fashion. In view of the above considerations, 
we regard the great majority of RNA degradation as 
occurring randomly. 

In considering RNA degradation in relation to RT-qPCR,
there are three steps during which degradation may
influence the relationship between the initial number of
RNA targets and the final number of cDNA targets measurable
by qPCR. These steps are prior degradation
of the RNA in the sample being assayed, degradation of
RNA by RT or early termination of cDNA synthesis, and
degradation of the cDNA. There is an individual probability
that a target molecule will survive each of these steps,
and the overall probability that an initial target molecule
will result in a final intact cDNA molecule is the product
of the three individual probabilities.

Degradation of RNA in the sample. For an RNA target
that is to be quantified by RT-qPCR, the target for degradation
that needs to be considered is shown in Figure 1.
It is the RNA segment that starts at the 3'-RNA base that
hybridizes to the most 5'-base of the reverse transcription
primer and extends to the 5'-RNA base that corresponds
to the 3'-end of the PCR target in the cDNA. The length
of this segment is (l + p) bases, where l is the length of the
PCR target and p is the length of the cDNA strand from
its 5'-end to the beginning of the PCR target. For random
priming, p is a mean value.

The probability that a given number of lesions will
affect the RNA segment is described by the binomial
distribution. If the mean number of lesions/base in RNA is
$r_1$, then the probability that there will be no lesions
affecting the RNA segment is $(1 - r_1)^{l+p}$. When $r_1$ is very small,
the Poisson distribution provides a good approximation
to the binomial distribution. The probability $P_1(0)$ of
no lesions in the segment is the zero term of a Poisson
distribution where $\mu$ is the mean number of lesions in
the strand. Thus,

$P_1(0) = e^{-\mu} = e^{-(l+p) r_1}$.
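
As a rough numerical check of this approximation, the following minimal sketch compares the exact binomial probability of no lesions with its Poisson form. The function names and the example values (0.0015 lesions/base over a 400-base segment) are illustrative assumptions and are not taken from the experiments reported here.

```python
import math

def p_intact_binomial(r1: float, n_bases: int) -> float:
    """Exact probability that none of n_bases (= l + p) carries a lesion."""
    return (1.0 - r1) ** n_bases

def p_intact_poisson(r1: float, n_bases: int) -> float:
    """Zero term of a Poisson distribution with mean mu = n_bases * r1."""
    return math.exp(-n_bases * r1)

# Hypothetical values: 1.5 lesions per 1000 bases, l + p = 400 bases.
exact = p_intact_binomial(0.0015, 400)
approx = p_intact_poisson(0.0015, 400)
print(f"binomial={exact:.4f} poisson={approx:.4f} diff={approx - exact:.2e}")
```

For lesion rates of this order the two values agree to about three decimal places, which is why the simpler exponential form is used in the rest of the model.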

Degradation of RNA by RT. RT has both polymerase and 
ribonuclease activity. A stem-loop secondary structure in 
the RNA template may block polymerase extension, but 
this only occurs with occasional targets and is minimized 
by the use of a RT, such as Superscript III, which 
can operate at a relatively high temperature. Impaired con- 
version of RNA targets to cDNA targets as a result of 
general impairment of processivity is unlikely, given 
the processivity of the enzyme (2400 bases/min for 
Superscripts II and III) and the long duration of the RT 
phase of RT-qPCR (30-40 min). Degradation of RNA by
the RNase activity of RT is a possibility. Naturally 
occurring RTs all have RNase H activity, but this has 
been decreased or eliminated in genetically engineered 
forms of the enzyme which are now widely used when per- 
forming RT-qPCR. Although degradation of RNA by RT 
is unlikely to be of major importance, we have
incorporated it into our model by introducing the term
$r_2$, the probability/base that RT will cause RNA strand
breakage and thus failure to complete cDNA synthesis.
Again, as in the previous section, the probability $P_2(0)$
that an intact RNA strand will be converted into an intact
cDNA target during reverse transcription is given by

$P_2(0) = e^{-(l+p) r_2}$.

Degradation of cDNA. If $r_3$ is the probability/base that
the cDNA strand will suffer a lesion that prevents polymerase
amplification, then, as in the previous sections, the
probability $P_3(0)$ that a synthesized cDNA strand will
persist as an intact target for PCR is given by

$P_3(0) = e^{-l r_3}$.

However, degradation of cDNA prior to performance
of qPCR is unlikely to be quantitatively important, owing
to the relative stability of DNA and as qPCR is usually
performed immediately after the reverse transcription
step. For this reason $r_3$ approximates 0 and $P_3(0)$
approximates 1.

Figure 1. The regions of RNA and cDNA used for the mathematical model, and the locations of variables l and p used in that model. l is the length
of the PCR product; p is the length of the region of cDNA that extends from its 5'-end to the upstream PCR primer binding site.

Multiplying $P_1(0)$, $P_2(0)$ and $P_3(0)$ gives the overall
probability $P(0)$ that an original RNA target will be converted
to a cDNA target that is amplifiable and quantifiable
by the qPCR. Thus,

$P(0) = e^{-(l+p)(r_1 + r_2)} = e^{-(l+p) r}$,

where $r = r_1 + r_2$.

If the amplifiable fraction (AF) is the proportion of
RNA target molecules that are undamaged and which
result in corresponding cDNA molecules amplifiable in
the PCR, then

$AF = P(0) = e^{-(l+p) r}$ (1)

Thus, for any RNA sample, AF of a target can be
calculated if l and p are known and r has been measured. The
method for measuring r is described below.
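
To give a feel for the magnitudes implied by Equation (1), the sketch below evaluates AF for a few amplicon lengths. The function name, the lesion rate (1.6 lesions per 1000 bases) and the value p = 300 bases are assumptions chosen only for illustration.

```python
import math

def amplifiable_fraction(l: float, p: float, r: float) -> float:
    """Equation (1): AF = exp(-(l + p) * r).

    l: PCR target length (bases); p: cDNA length 5' of the PCR target (bases);
    r: lesions per base (r1 + r2).
    """
    if l < 0 or p < 0 or r < 0:
        raise ValueError("l, p and r must be non-negative")
    return math.exp(-(l + p) * r)

# Hypothetical sample: r = 0.0016 lesions/base, p = 300 bases.
for length in (80, 200, 400):
    print(length, round(amplifiable_fraction(length, 300, 0.0016), 3))
```

Even at a modest lesion rate the amplifiable fraction falls noticeably as the target lengthens, which is the effect the correction is designed to remove.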

Exponential amplification 

During exponential amplification

$N_c = N_0 a^c$,

where a is the amplification efficiency, defined as the proportional
increase per cycle; c is the number of cycles;
$N_0$ is the initial number of sequences being amplified
and $N_c$ is the number of amplified sequences present
after c cycles. When $c = C_t$, the threshold number of
cycles for real-time PCR,

$N_{C_t} = N_0 a^{C_t}$. (2)

If m is the mass of cDNA initiating the PCR and z is the
mass containing one instance of the target sequence,

$N_{C_t} = (m/z) \cdot AF \cdot a^{C_t}$. (3)

From Equations (1) and (3)

$N_{C_t} = (m/z) \cdot e^{-(l+p) r} \cdot a^{C_t}$.

Rearranging and taking logarithms to base e

$C_t \cdot \log_e a = (l+p) r + \log_e(N_{C_t} \cdot z/m)$.

Since $N_{C_t}$, z and m are constants, the equation has the
following form

$C_t \cdot \log_e a = (l+p) r + \text{constant}$. (4)

There is therefore a linear relationship between $C_t \log_e a$
and (l+p), with slope r. When random priming is used for
reverse transcription, p has a mean value that is the reciprocal
of the probability that priming will commence at any
one base. The value of r can then be directly calculated
as the slope of the linear relationship between $C_t \log_e a$
and l. When gene-specific or poly-dT priming is used,
the value of p for each length is known, and the value of
r can then be calculated as the slope of the linear relationship
between $C_t \log_e a$ and (l+p). When random priming
is used, the value of p can be determined by using
Equation (4), which indicates that, when l equals -p, the
value of Ct becomes constant and independent of r. This
enables the value of p to be determined by studying RNA
samples degraded to different extents (see 'Results' section
and Figure 2).
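
The intersection argument can be sketched numerically. In the example below, Ct values are generated from Equation (4) for two samples that differ only in r, a regression line of Ct on l is fitted to each, and p is recovered as minus the length at which the lines cross. The helper names and all numerical values (p = 300, a = 1.9, the constant 15.0 and the two lesion rates) are assumptions made for illustration, and the data are synthetic and noise-free.

```python
import math

def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def estimate_p(line_a, line_b):
    """Return p = -l, where l is the length at which two (slope, intercept) lines cross."""
    (s1, c1), (s2, c2) = line_a, line_b
    if s1 == s2:
        raise ValueError("parallel lines do not intersect")
    return -(c2 - c1) / (s1 - s2)

def model_ct(l, p, r, a, k):
    """Ct predicted by Equation (4): Ct * ln(a) = (l + p) * r + k."""
    return ((l + p) * r + k) / math.log(a)

lengths = [100, 200, 300, 400]
control = fit_line(lengths, [model_ct(l, 300, 0.0015, 1.9, 15.0) for l in lengths])
heated = fit_line(lengths, [model_ct(l, 300, 0.0060, 1.9, 15.0) for l in lengths])
print(round(estimate_p(control, heated), 1))
```

With real Ct values the crossing point is sensitive to small errors in the two slopes, so an estimate of p obtained in this way should be expected to scatter considerably.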

Although the value of a can be empirically determined 
for each sequence length, it is much simpler to use a 
robust, efficient PCR amplification system, for which the 
value of a is known and is constant over the range of 
sequence lengths studied. Optimization of the PCR conditions
enables this to be achieved for a reasonable range of
amplicon sizes. If a is constant, then the slope of Ct versus
l, or Ct versus (l+p), can be determined, and

$r = \text{slope} \cdot \log_e a$. (5)
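
A minimal sketch of the calculation in Equation (5) follows: the slope of Ct on amplicon length is obtained by least squares and multiplied by the natural logarithm of a. The function name and the Ct values are hypothetical; only the form of the calculation comes from the model.

```python
import math

def lesions_per_base(lengths, ct_values, efficiency):
    """Equation (5): r = (slope of Ct versus length) * ln(a)."""
    n = len(lengths)
    if n < 2 or n != len(ct_values):
        raise ValueError("need at least two (length, Ct) pairs")
    mean_l = sum(lengths) / n
    mean_ct = sum(ct_values) / n
    sxx = sum((l - mean_l) ** 2 for l in lengths)
    sxy = sum((l - mean_l) * (ct - mean_ct) for l, ct in zip(lengths, ct_values))
    return (sxy / sxx) * math.log(efficiency)

# Hypothetical Ct values for four amplicons of one reference gene, a = 1.89.
r = lesions_per_base([120, 220, 320, 420], [24.10, 24.35, 24.60, 24.85], 1.89)
print(f"{1000 * r:.2f} lesions per 1000 bases")
```

For gene-specific or poly-dT priming the same function can be applied with (l+p) in place of l, since only the slope enters the result.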

In most situations, RNA targets of interest are
quantified relative to a standard gene in the same
sample. If, in the sample, $N_{tar}$ is the initial total number
of molecules, intact and degraded, of a target to be
quantified, then from Equations (1) and (2)

$N_{C_t} = N_{tar} \cdot e^{-(l+p) r} \cdot a^{C_t}$. (6)

The assumptions for relative quantification of a target
sequence (subscript tar) against a standard reference or
'housekeeping' sequence (subscript st) are that $N_{C_t}$ and a
are the same for both target and standard. Since both
target and standard are quantified in the same RNA/cDNA
sample, r is also the same. If p is also the same,
as is the case for random priming, then

$N_{tar}/N_{st} = e^{-(l_{st} - l_{tar}) r} \cdot a^{(C_{t,st} - C_{t,tar})}$. (7)

This is the familiar ΔΔCt calculation (8), corrected for
degradation in a way that takes account of product length. For
gene-specific or oligo-dT priming, l needs to be replaced by
l + p. The value of r is most simply obtained by amplifying
several different lengths of a reference sequence in the
test sample and applying Equation (5). The reference sequence
can be from the same gene as the target, but is most
conveniently a separate sequence for which amplification
efficiency has been previously well characterized. If
relative quantification is performed using the geometric
mean of results for several standards (9), then $l_{st}$ is the
arithmetic mean of the lengths of those standards.
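
The correction in Equation (7) can be sketched as a single function. The name and the example inputs are assumptions for illustration; for gene-specific or oligo-dT priming the length arguments would be l + p rather than l.

```python
import math

def relative_quantity(ct_tar, ct_st, l_tar, l_st, r, a):
    """Equation (7): N_tar / N_st = exp(-(l_st - l_tar) * r) * a ** (Ct_st - Ct_tar).

    With r = 0, or with equal lengths, this reduces to the conventional
    calculation a ** (Ct_st - Ct_tar).
    """
    return math.exp(-(l_st - l_tar) * r) * a ** (ct_st - ct_tar)

# Hypothetical degraded sample: 400-base target against a 100-base standard.
conventional = relative_quantity(25.0, 23.0, 400, 100, 0.0, 2.0)
corrected = relative_quantity(25.0, 23.0, 400, 100, 0.005, 2.0)
print(conventional, round(corrected, 3))
```

When the target amplicon is longer than the standard, the conventional result underestimates the target, and the correction factor is greater than 1.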

In the uncommon situation where an RNA sequence in
a test sample (subscript t) is quantified relative to the same
RNA sequence in an external reference sample (subscript
e), $r_t$ and $r_e$ must be determined. This can be done
by amplifying several lengths of the sequence of interest,
or of a separate reference sequence, in each sample
and applying Equation (5). The ratio of AF of the target
sequence in the test sample to AF in the external sample
can then be determined, because from Equation (1)

$AF_t/AF_e = e^{-(l+p)(r_t - r_e)}$.

This ratio can then be used to correct the RT-qPCR
results for degradation.
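
The text does not spell out how the AF ratio enters the final result. One reading, which follows from Equation (6) if the number of amplicons at threshold and the efficiency a are the same in both samples, is to divide the apparent test/external ratio by the AF ratio. The sketch below implements that reading; the function names and values are hypothetical.

```python
import math

def af_ratio(l, p, r_test, r_ext):
    """AF_t / AF_e = exp(-(l + p) * (r_t - r_e)), from Equation (1)."""
    return math.exp(-(l + p) * (r_test - r_ext))

def corrected_test_to_external(ct_test, ct_ext, l, p, r_test, r_ext, a):
    """Apparent ratio a ** (Ct_e - Ct_t), divided by the AF ratio (assumed usage)."""
    apparent = a ** (ct_ext - ct_test)
    return apparent / af_ratio(l, p, r_test, r_ext)

# Hypothetical: the test sample is more degraded than the external reference.
print(round(corrected_test_to_external(26.0, 25.0, 200, 100, 0.004, 0.001, 2.0), 3))
```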

With gene-specific or oligo-dT priming, the value of r
can also be obtained by amplifying two sequences, one at
the 3'-end and one at the 5'-end of a cDNA sequence.
Assuming that the number of amplicons at threshold is
the same for the two probes, rearrangement of Equation
(6) leads to the relation

$r = \log_e a \cdot (C_{t,3'} - C_{t,5'}) / ((l_{3'} + p_{3'}) - (l_{5'} + p_{5'}))$,

where the subscripts 3' and 5' refer to the PCR amplifications
of the sequences at the 3'- and 5'-regions of the
cDNA. This equation reduces to

$r = \log_e a \cdot (C_{t,3'} - C_{t,5'}) / (l_{3'} + l_i)$, (8)

where $l_i$ is the length of the sequence intervening between
the most 5'-base of the 3'-sequence and the most 3'-base of
the 5'-sequence.

Figure 2. RNA degradation by heat. Nine RNA samples were held at different temperatures for 30 min and then analysed by RT-qPCR. (A) The Ct
values for the different samples. (B) The calculated values for lesions per 1000 bases. In A, the regression lines for the Ct values have been
extrapolated leftwards to estimate the value of p. This value equals -l, where l is the amplicon length at which the regression lines intersect.
In this experiment, which used random priming for reverse transcription, intersects mostly fell between -300 and -500 bases.
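
The two-amplicon estimate of Equation (8) is a one-line calculation, sketched below. The function name, argument names and example values are assumptions; the 3' and 5' labels follow the convention used in the text, i.e. they refer to positions on the cDNA.

```python
import math

def lesions_per_base_two_amplicons(ct_3p, ct_5p, l_3p, l_i, a):
    """Equation (8): r = ln(a) * (Ct_3' - Ct_5') / (l_3' + l_i).

    l_3p: length of the amplicon at the 3'-region of the cDNA;
    l_i: length of the sequence between the two amplicons.
    """
    span = l_3p + l_i
    if span <= 0:
        raise ValueError("l_3p + l_i must be positive")
    return math.log(a) * (ct_3p - ct_5p) / span

# Hypothetical Ct values for two amplicons whose lengths differ by 500 bases in (l + p).
print(lesions_per_base_two_amplicons(26.0, 25.0, 100, 400, 2.0))
```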

MATERIALS AND METHODS 

Blood samples were collected from a healthy volunteer, 
into EDTA, and RNA was extracted immediately. RNA 
extraction used the QiaAmp RNA Mini Kit (Qiagen). 
In brief, erythrocytes were lysed in hypotonic media; 
leukocytes were pelleted, then lysed and nucleic acids 
precipitated with ethanol; RNA was bound to a spin 
column; the column was washed, then RNA was eluted 
in sterile RNAse-free water. RNA was either used immediately
or aliquotted and frozen at -80°C. Aliquots, once
thawed, were not re-frozen.

Unless otherwise stated, all studies used the NRAS gene
(neuroblastoma RAS viral oncogene homologue, GeneID 4893,
MIM 164790). Total RNA was reverse transcribed with
Superscript III (Invitrogen) and RNAseOUT recombinant
ribonuclease inhibitor (Invitrogen), with either specific
primers or random hexamers, according to the manufacturer's
protocol. Primers were first annealed to RNA by
heating to 65°C for 5 min and then cooling on ice for at
least 1 min. Reverse transcription was in 25 µl of 50 mM
Tris-HCl pH 8.3, 75 mM KCl, 3 mM MgCl2, 5 mM DTT,
40 units RNAseOUT, 200 units Superscript III, 0.5 mM
each dNTP and total RNA from up to 120 µl blood (i.e.
500,000 leukocytes). If random primers were used, the
mixture was then incubated at 25°C for 5 min. Reverse
transcription was at 50°C for 30 min, after which the
reaction was stopped by heating to 70°C for 15 min.
cDNA was used immediately or stored at -80°C.

PCRs were performed in 25 µl volumes containing 1-2
U Platinum Taq (Invitrogen), 20 mM Tris-HCl pH 8.4,
50 mM KCl, 5 mM MgCl2, 300 µM each of dATP, dGTP,
dCTP and dUTP; 50 ng each primer; up to 0.4 µl cDNA
and 20 ng (2-2.5 pmol) hydrolysis probe. Reactions were
manually set up in 96-well 0.2 ml Axygen PCR microplates
with strip clear flat top caps (Axygen, Union City, CA,
USA). Cycling conditions were as follows: an initial
denaturation at 96°C for 2 min; then up to 55 cycles of
94°C for 15 s, 55°C for 90 s and 72°C for 60 s. Cycling
was done on a BioRad IQ5 thermal cycler, running IQ5
V2.0.148.60623 software. The features felt to be of particular
importance for efficient amplification were the high
concentrations of MgCl2 and dNTPs and the prolonged
annealing time.

The primers used are shown in Table 1. To generate a
set of PCR products of varying lengths, all RT-qPCRs
used the same probe and the same primer 5' of the probe
(extension of this primer hydrolyses the probe). The remaining
primers were designed to give amplicons
between 80 and 400 bases in length. Primers were designed
for a Tm of >51°C (nearest neighbor method) under PCR
reaction conditions. For the RIN work, duplicate aliquots
of RNA were incubated at temperatures between 55 and
92°C for 30 min, and then one aliquot was reverse
transcribed and analysed as above; the other was analysed on
an Agilent 2100 Bioanalyser (Agilent Technologies) with
an RNA 6000 Pico Labchip Kit. Amplicon sizes
stated here include the primers.



RESULTS 

In the previous study of DNA degradation, using an
optimized amplification protocol, the value of a was
found to be constant over the range of lengths studied for
the NRAS sequence and for four other sequences (6). This
study also used the same amplification protocol and the
NRAS gene and the same probe, but different target sequences,
and primers on exon-exon boundaries.

Table 1. Primer sets for assessing degradation in RNA/cDNA

Gene    Oligo     Sequence
NRAS    Probe     VIC-attgtcagtgcgcttttcccaacaccacctg-BHQ
        F1        cacttgttttctgtaagaatcctct
        F2        actcgcttaatctgctccctgt
        F3        aaaaagcatcttcaacaccctgtc
        F4        tcttttctgacaaaactttaaaagtatcttg
        F5        gaaatgactgagtacaaactggtggtg
        F6        ccggctgtggtcctaaatctgtc
        F7        gggctgttcatggcggttcc
        R1        aggcagtggagcttgaggttc
        R2        gaatcctctatggtgggatcatattcatct
GAPDH   Probe 1   FAM-ccatcaccatcttccaggagcgagatccctc-BHQ
        Probe 2   FAM-ccatccacagtcttctgggtggcagtgatg-BHQ
        F1        gagaacgggaagcttgtcatcaatgg
        F2        gaccccttcattgacctcaactacatgg
        F3        tgcaggggggagccaaaagg
        F4        cagcctcaagatcatcagca
        F5        ccatgacaactttggtatcg
        F6        gtggaaggactcatgaccac
        R1        ccagcatcgccccacttgatt
        R2        atggtggtgaagacgccagtgg
        R3        ggggcagagatgatgacccttttgg
        R4        ggttcacacccatgacgaacatgg
        R5        aggaggcattgctgatgatcttgagg
        R6        gtccttccacgataccaaagttgtcatgg
        R7        cacgccacagtttcccggag
APC     Probe 1   FAM-aagcagaattagatgctcagcactt-BHQ
        Probe 2   FAM-aagtgctgagcatctaattctgctt-BHQ
        F1        tgccatctcttcatgttagg
        F2        ggaatctcatggcaaatagg
        F3        gccaatattatgtctcctgg
        F4        [sequence illegible in source]
        F5        ggaagcattatgggacatgg
        R1        actacgatgagatgccttgg
        R2        catcatgtcgattggtgtc
        R3        aaggacagtcatgttgccag
        R4        ttcctcttgatgaagaggag
        R5        ctagaccaattccgcgttct
        R6        gccttgggacttaaattgtc

In four studies involving products with lengths from 120 
to 458 bp, the mean amplification efficiency/cycle was 1.89 
and the regression line between amplification efficiency 
and length had a slope of -5.2 × 10^-5, which was not
significantly different from zero. 

The number of lesions per 1000 bases was determined in 
33 studies, 9 on freshly prepared RNA and 24 on RNA 
that had been frozen at -80°C and thawed for study. In
the initial 12 studies, the regression line of Ct versus
amplicon length was determined using at least three different
amplicons, but subsequent studies generally used
only two amplicons. The mean value for lesions per
1000 bases was 1.47 for the fresh samples and 1.63 for
the frozen samples, but the difference was not significant
(t-test, P > 0.05) and the results were pooled. The mean
(SD) value for the 33 studies was 1.59 (0.61) and the mean
value was significantly different from 0 (t = 2.58, P < 0.01,
one-tailed t-test).

In a total of 22 studies, RNA degradation was produced
by heating RNA for 30 min, and in every case an increase
in the measured value of degradation was observed. In each
of three experiments, nine different temperatures were
used, and in each case increasing temperature produced
increasing degradation; the results from one of the three
experiments are shown in Figure 2. The left panel shows
the various regression lines for the relationship between Ct
and l, and the right panel shows the calculated numbers
of lesions/base, in relation to temperature. In these
studies, reverse transcription was initiated by random
priming and thus the value of p was not known a priori.
However, our model indicates that, when l equals -p, the Ct
value should be independent of r, and it therefore predicts
that the various regression lines should intersect at around
the value of -p. This prediction was borne out in each
study in which RNA was degraded by heating. The
points of intersection were between -200 and -500
bases, although there was marked experimental variation,
presumably owing to variation in Ct results having a
marked effect on the slopes of the regression lines.

To further test the model, we experimentally manipulated
p and observed the effect on the point of intersection
between the regression line for control RNA and the
regression line for RNA degraded by heating. This was
done in two ways. (i) Manipulation of random priming
was performed in four experiments. Since the probability
of transcription being initiated at any one base is the reciprocal
of p, we reasoned that varying the concentration
of random primers would influence the probability of the
initiation of reverse transcription, and hence that the value of p
as determined from the intersection point would decrease
as the primer concentration increased. Of the four experiments,
one used 8, 60 and 200 ng of hexamers; the other
three used 4, 20 and 100 ng of hexamers. Regression lines
were based on the results from amplicons of two
lengths, each amplified in triplicate. For the four experiments,
the estimated values of p were: 224, 107 and 97 bp;
286, 283 and 161 bp; 1741, 187 and -57 bp; 422, 214 and
174 bp. Despite the experimental variation, in each experiment
an increasing concentration of hexamers produced a
decreasing estimate for p. (ii) Manipulation of
gene-specific priming was performed in one experiment.
Three values of p were produced by using three reverse
transcription primers, and for each primer, a series of
PCR primers was used in order to determine the intersection
point between the regression lines for control and
degraded RNA and thus estimate the value of p for that
transcription primer. Three amplicons were PCR
amplified, each in triplicate. The expected values for p
were 0, 220 and 338 bp; the observed results were 8, 118
and 408 bp. Thus the results of the two approaches to
manipulating p were consistent with the predictions of
the model.

In another four experiments using control and heated
RNA (data not shown), various combinations of
gene-specific primers were used for initiation of reverse
transcription to produce various values for (l+p). These
experiments directly examined Equation (4), which
predicts a linear relationship between Ct and (l+p) and
predicts an intersection point when (l+p) equals zero. The
four intersection points were at -14, -13, 32 and 119
bases. Given the imprecision of measurement of the
intersection point, these results are consistent with the
prediction of the model.

Figure 3. Correlation between measuring RNA integrity by the present method and by the ΔΔCt method. The RNA in 30 samples was degraded by
heat. The present method quantified degradation by studying the degraded sample only, whereas the ΔΔCt method quantified degradation using the
Ct difference between the degraded sample and a control undegraded sample of the same RNA. [Axis label: lesions/1000 bases derived from ΔΔCt.]

When a control sample of RNA is available, the extra
lesions produced by heating can be determined by RT-qPCR,
by amplifying the same sequence in both
samples, noting the Ct difference between the control
and heated sample and calculating using the ΔΔCt
method. From the results of the various experiments in
which RNA was degraded by heat and in which the
control sample was undegraded RNA, it was possible to
compare, for each heated sample, the lesions per base as
determined by the ΔΔCt method with the lesions per base
as determined by using Equation (5) of our model. The
results of this comparison, shown in Figure 3, indicate a
highly significant correlation (lesions per base:
model = 0.9006 × [ΔΔCt] + 0.73; r = 0.89, P < 0.0005).
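
The calculation used to convert a Ct difference into lesions per base is not given explicitly in this section. A sketch of one way it might be done, assuming that Equation (4) holds for the same amplicon in both samples with the same constant, is shown below; the function name and all values are hypothetical.

```python
import math

def extra_lesions_per_base(ct_heated, ct_control, l, p, a):
    """Assumed form: delta_r = ln(a) * (Ct_heated - Ct_control) / (l + p).

    Follows from writing Equation (4) for the heated and control samples
    and subtracting, so that the constant cancels.
    """
    return math.log(a) * (ct_heated - ct_control) / (l + p)

# Hypothetical: heating delays the Ct of a 200-base amplicon (p = 300) by 2 cycles.
print(f"{1000 * extra_lesions_per_base(27.0, 25.0, 200, 300, 2.0):.2f} extra lesions per 1000 bases")
```

This gives only the lesions added by heating; the slope-based estimate from Equation (5) also includes lesions already present in the control RNA, which is consistent with a positive intercept in the comparison of the two methods.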

In one experiment, the present method was compared 
with the RIN method for quantification of RNA damage 
produced by heating. The results are shown in Figure 4. 
Except possibly for a small window, the RIN method sug- 
gested that the RNA was either completely intact or com- 
pletely damaged, whereas the present method showed 
progressively increasing RNA degradation as the degree 
of heating was increased. In the samples that RIN 
indicated were completely damaged, the present method 
was able to distinguish different degrees of damage and 
indicated that most small targets were still intact. 

To determine whether our model could be applied to
other gene targets, we investigated two other genes,
GAPDH (glyceraldehyde-3-phosphate dehydrogenase;
GeneID 2597; MIM 138400) and APC (adenomatous
polyposis coli, GeneID 324, MIM 611731) and, for each
gene, synthesized two series of primers, with each series
comprising 1 upstream primer and 5 or 6 downstream
primers. Amplicon sizes ranged overall from 63 to
417 bp. Within each series, the smallest amplicon was
between 63 and 182 bases, and the largest amplicon was
155-249 bases longer. When the integrity of RNA in a
sample was investigated with these four series of
primers, the mean amplification per cycle was 1.97, 1.91,
2.07 and 2.07, respectively; the slopes of the relationship
between amplification efficiency and amplicon length were
-6.1 × 10^[exponent illegible in source], 3.3 × 10^-5, -4.0 × 10^-4 and -8.2 × 10^-4,
respectively, with none of these values being significantly
different from 0; and the calculated lesions per 1000
bases were 5.6, 4.6, 5.5 and 3.1, respectively.

Figure 4. Comparison between the method described here for measuring
lesions/base, and the RIN score, which is based on automated
electrophoresis and which gives a result between 1 (totally degraded)
and 10 (fully intact). Nine RNA samples were partially degraded by
heating at various temperatures for 30 min and then analysed. The
number of lesions/base shows a progressive increase as the temperature
of heating increases whereas RIN shows a largely 'all-or-nothing'
response.

Table 2. Relative quantification of different lengths of GAPDH mRNA relative to different lengths of APC mRNA as the standard
Relative quantification was performed either by the conventional method or after quantifying degradation and then applying Equation (7). The RNA
was either control RNA or RNA degraded by heating at 91°C for 30 min. The results obtained by conventional relative quantification are influenced
by the length of the test amplicon, the length of the standard amplicon and the presence of degradation. These effects disappear after correction for
degradation and length. For degraded RNA, in each experiment correcting the results of conventional relative quantification resulted in a highly
significant (P < 0.0005) decrease in their variance; for control RNA correction produced a decrease in variance in each case but the decreases were
not significant.

In six further experiments, we used the three genes to
measure lesions per 1000 bases in control RNA and RNA
degraded by heating and to calculate relative quantification
of each amplicon of two genes using each amplicon of
the third gene in turn as a standard. Relative quantification
was calculated either conventionally or by using our
model. For application of our model, degradation was
measured by applying Equation (5) to the Ct values
obtained from amplification of the various amplicons of
each gene, and relative quantification was then calculated
using Equation (7). The results in all experiments were
similar and were not affected by which gene was used
for test RNA and which for the standard. For conventional
relative quantification, if the standard amplicon
was constant then the result for the test amplicon
decreased with increasing amplicon length and, conversely,
if the test amplicon was constant then the result
increased as the length of the standard amplicon
increased. These effects were increased by degradation
but were absent when our model was used to calculate
relative quantification. The results of quantification of
GAPDH relative to APC from three consecutive experiments
are shown in Table 2.

To determine the minimum amount of RNA needed for 
our method, we performed experiments each using serial 
dilution of RNA, reverse transcription using random or 
gene-specific priming and assessment of degradation. 
GAPDH was used in these experiments because it had the 
highest RNA level. The results of these experiments are 
shown in Table 3. For undegraded RNA, quantification of
integrity required ~1 pg of RNA; for degraded RNA, the
amount of RNA required increased as the extent of degradation
increased. For degraded RNA, less RNA was
required when gene-specific priming was used, presumably
because the zero value of p resulted in less degradation.



DISCUSSION 

The model provides a simple method for quantifying
RNA integrity and hence improving quantification of
RNA by RT-qPCR. Quantification of RNA integrity is
based on amplifying different lengths of cDNA sequences
of a reference transcript. The method assumes that the
amplification efficiency is independent of length and is
known. Our protocol for fulfilling this assumption for
genomic DNA has been reported (6) and is also used
here, but any protocol which fulfils this assumption
would be suitable.

Table 3. The least mass of RNA required for quantification of
degradation

Although the model covers quantification of RNA 
using an external sample, in nearly all cases target RNA 
is quantified relative to one or more standard RNAs in the 
same sample. The model enables RNA integrity to be 
determined by PCR amplification of several lengths of a 
reference sequence, using the same probe and with application
of Equation (5). Little extra work is involved, particularly
for an experiment in which multiple RNA species
are being quantified. Once RNA integrity has been
quantified, target RNA sequences can be quantified by
applying Equation (7). If desired, an RNA standard can
also be used as a reference providing that the amplification 
characteristics of different lengths of the standard are 
known. A panel of RNA reference genes, rather than a 
single reference gene, can also be used. 

Equation (7) indicates that correction for RNA degrad- 
ation becomes increasingly important as the difference 
between the length of the target and the reference increases,
and/or as the degree of degradation increases.
(Here, length is l for random priming and l+p for
gene-specific or oligo dT priming.) This theoretical prediction
was confirmed experimentally as the effects of length
and degradation were observed when relative quantifica- 
tion was calculated conventionally but were not observed 
when relative quantification was calculated according 
to our model (see 'Results' section and Table 2). The ex- 
perimental results and the underlying theory indicate the 
desirability of locating the primers for the target and 
standard, so that l or l+p are similar for each. At times,
in addition to quantification, the absolute level of detection
of an RNA may also be important. The model
suggests that minimizing l+p will improve detection.
The value of l can be decreased by using short amplicons.
The value of p can be decreased by improving
hybridization of the reverse transcription primer, either
by increasing its concentration or increasing its Tm, or
by using a gene-specific primer.
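
The dependence of the correction on length difference and degradation can be illustrated with a short Python sketch. It is not a transcription of Equation (7). It assumes that the intact fraction of a template of effective length L is exp(-r*L), so that a conventional ratio is off by a factor of exp(-r*(L_target - L_ref)). All function names and numerical values are hypothetical.

```python
import math


def conventional_ratio(ct_target, ct_ref, efficiency=1.0):
    """Target/reference ratio from Ct values alone, ignoring degradation."""
    return (1.0 + efficiency) ** (ct_ref - ct_target)


def corrected_ratio(ct_target, ct_ref, len_target, len_ref, r, efficiency=1.0):
    """Ratio corrected for degradation under the assumed exp(-r * L) model.

    len_target and len_ref are the effective lengths: l for random priming,
    l + p for gene-specific or oligo dT priming.
    """
    correction = math.exp(r * (len_target - len_ref))
    return conventional_ratio(ct_target, ct_ref, efficiency) * correction


# Hypothetical example: 300-base target, 100-base reference, r = 0.005 lesions/base.
uncorrected = conventional_ratio(25.0, 22.0)               # 0.125
corrected = corrected_ratio(25.0, 22.0, 300, 100, 0.005)   # 0.125 * e, about 0.34
same_length = corrected_ratio(25.0, 22.0, 200, 200, 0.005)
```

When the two effective lengths are equal the correction factor is 1 whatever the value of r, which is why matching l or l+p for target and standard is desirable.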

Current methods for assessing RNA degradation
only give a qualitative or semiquantitative result and are 
generally used only to decide whether the sample is suffi- 
ciently intact to enable further analysis. Perhaps the most 
widely used current approach for assessing RNA degrad- 
ation is RIN, which assesses mRNA indirectly, by assess- 
ing ribosomal RNA. Our method has a number of 
advantages over RIN since: it gives a quantitative result 
which enables RT-qPCR results to be corrected for deg- 
radation; it enables gene expression to be quantified in 
samples which RIN would suggest were too degraded to 
be analysable; it requires over an order of magnitude less 
RNA than RIN; and it is simpler, cheaper and does not 
require a dedicated instrument. 

A recent detailed review (10) of the effects of RNA
degradation on the results of gene expression studies
concluded that RNA quality influences results and that
some improvement could be obtained by performing
relative quantification using the geometric mean of four
standard housekeeping genes or by performing a 5'-3'
assay on cDNA produced by oligo dT priming. Our
model is compatible with either one or a number of 
genes as standards but, as indicated in Equation (7), 
length and degradation both need to be considered. The 
5'-3' assay as originally described (2) provides only a
semiquantitative indication of degradation, but Equation 
(8) indicates that it can provide a truly quantitative 
measure of degradation, which can then be used in 
Equation (7) to calculate relative quantification. The 
5'-3' assay has the advantage that the amplicons for
qPCR can be quite short, which minimizes the risk of 
decreased amplification efficiency associated with long 
amplification, but the method assumes that the fluores- 
cence thresholds for the two probes will be reached at 
the same number of amplified molecules. Furthermore, 
when the experiment involves random priming, it would 
seem simpler to determine r by amplifying several lengths 
of a reference sequence, and thus avoid having to perform 
additional oligo dT or gene-specific priming. 
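
Under the model, a 5'-3' measurement amounts to a two-point version of the slope calculation. The sketch below is illustrative only and is not a transcription of Equation (8). It assumes oligo dT priming, random lesions, the same efficiency E for both amplicons and the same number of molecules at threshold for both probes (the assumption noted above). Distances are measured from the priming site to the far end of each amplicon. The names and numbers are hypothetical.

```python
import math


def lesions_per_base_two_point(ct_far, ct_near, dist_far, dist_near,
                               efficiency=1.0):
    """Estimate lesions/base from two amplicons at different distances (l + p)
    from the reverse transcription priming site.

    Assumes intact fraction exp(-r * distance), so the Ct difference equals
    r * (dist_far - dist_near) / ln(1 + efficiency).
    """
    if dist_far == dist_near:
        raise ValueError("the two distances must differ")
    delta_ct = ct_far - ct_near
    return delta_ct * math.log(1.0 + efficiency) / (dist_far - dist_near)


# Hypothetical example: 5' amplicon ends 1500 bases from the polyA priming
# site, 3' amplicon ends 200 bases from it, and the 5' Ct is 2 cycles later.
r_two_point = lesions_per_base_two_point(24.0, 22.0, 1500, 200)
```

A two-point estimate has no residuals, so it cannot reveal departures from the assumed log-linear relationship in the way that a regression over several lengths can.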

Our model is based on several assumptions, which are 
not unique to it. RNA degradation may not always be 
random; the efficiency of random priming may not be 
the same for all genes; the efficiency of amplification and 
the number of amplicons at threshold may not be the same 
for all genes quantified; and secondary structure may 
affect reverse transcription and cause under-estimation 
of transcript number. Nevertheless, these assumptions 
seem reasonable approximations and, provided they are 
borne in mind, we believe that our model will provide a 
useful tool for quantifying RNA integrity and improving 
the quantification of gene expression. 

Although our model has been constructed and tested 
from the perspective of PCR, mRNA can be quantified 
by a number of methods which do not involve nucleic acid 
amplification. For any method, there will be a critical 
region of RNA, of length L bases, which is characterized 
by the property that a lesion of any base within it will 
affect quantification whereas a lesion of any base outside 
it will have no effect. For PCR, L is the same as l+p. Our
model will apply to any method in which the value of 
L varies for different genes. Several methods involve 
hybridization at one point along the RNA strand and
quantification of another region at a variable distance 
along the RNA strand. For example: for microarray, 
length variation will occur if the labelled cDNA is 
produced by oligo-dT priming or if there is any variation 
in the length of the hybridization probes; for SAGE and 
related methods, length variation will occur owing to vari- 
ation between the point of oligo-dT capture and the point 
of restriction enzyme digestion; for Nanostring, length 
variation will occur if there is variation between the 
points of hybridization of the capture probe and the meas- 
urement probe; and for RNA-Seq, if capture by oligo-dT 
is used, prior degradation will influence the frequency of 
reads along the RNA strand. With each method, our PCR 
method may be used to quantify RNA degradation and 
the results of relative quantification may then be corrected 
using Equation (7) and substituting L for l in that
equation. RNA-Seq also enables RNA degradation to 
be measured directly, by measuring the frequency with 
which reads are recovered along the cDNA strand or, al- 
ternatively, degradation does not need to be separately 
measured if quantification for all genes is performed by 
determining read frequency at the same distance from the 
polyA region. 
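
The direct RNA-Seq measurement can be sketched as a log-linear fit of read frequency against distance from the polyA region. This is a rough illustration under strong assumptions: oligo-dT capture, random lesions giving read frequency proportional to exp(-r*d) at distance d, and no other positional bias in coverage. The function name and the simulated depths are hypothetical.

```python
import math


def lesions_per_base_from_coverage(distances, read_depths):
    """Estimate lesions/base as minus the slope of ln(read depth) on distance
    from the polyA region (assumed model: depth proportional to exp(-r * d))."""
    n = len(distances)
    if n < 2 or n != len(read_depths):
        raise ValueError("need at least two (distance, depth) pairs")
    if any(depth <= 0 for depth in read_depths):
        raise ValueError("read depths must be positive to take logarithms")
    logs = [math.log(depth) for depth in read_depths]
    mean_x = sum(distances) / n
    mean_y = sum(logs) / n
    sxx = sum((x - mean_x) ** 2 for x in distances)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(distances, logs))
    return -sxy / sxx


# Simulated depths for a transcript with r = 0.002 lesions/base.
distances = [100, 300, 500, 700]
depths = [1000.0 * math.exp(-0.002 * d) for d in distances]
r_seq = lesions_per_base_from_coverage(distances, depths)
```

Real coverage profiles are also shaped by fragmentation, priming and mapping biases, so an estimate of this kind would need to be checked against the PCR-based measurement.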